Detect differentially methylated regions using non-homogeneous hidden Markov model for methylation array data
نویسندگان
چکیده
Motivation DNA methylation is an important epigenetic mechanism in gene regulation and the detection of differentially methylated regions (DMRs) is enthralling for many disease studies. There are several aspects that we can improve over existing DMR detection methods: (i) methylation statuses of nearby CpG sites are highly correlated, but this fact has seldom been modelled rigorously due to the uneven spacing; (ii) it is practically important to be able to handle both paired and unpaired samples; and (iii) the capability to detect DMRs from a single pair of samples is demanded. Results We present DMRMark (DMR detection based on non-homogeneous hidden Markov model), a novel Bayesian framework for detecting DMRs from methylation array data. It combines the constrained Gaussian mixture model that incorporates the biological knowledge with the non-homogeneous hidden Markov model that models spatial correlation. Unlike existing methods, our DMR detection is achieved without predefined boundaries or decision windows. Furthermore, our method can detect DMRs from a single pair of samples and can also incorporate unpaired samples. Both simulation studies and real datasets from The Cancer Genome Atlas showed the significant improvement of DMRMark over other methods. Availability and implementation DMRMark is freely available as an R package at the CRAN R package repository. Contact [email protected]. Supplementary information Supplementary data are available at Bioinformatics online.
منابع مشابه
Spatially Enhanced Differential RNA Methylation Analysis from Affinity-Based Sequencing Data with Hidden Markov Model
With the development of new sequencing technology, the entire N6-methyl-adenosine (m(6)A) RNA methylome can now be unbiased profiled with methylated RNA immune-precipitation sequencing technique (MeRIP-Seq), making it possible to detect differential methylation states of RNA between two conditions, for example, between normal and cancerous tissue. However, as an affinity-based method, MeRIP-Seq...
متن کاملseqlm: an MDL based method for identifying differentially methylated regions in high density methylation array data
MOTIVATION One of the main goals of large scale methylation studies is to detect differentially methylated loci. One way is to approach this problem sitewise, i.e. to find differentially methylated positions (DMPs). However, it has been shown that methylation is regulated in longer genomic regions. So it is more desirable to identify differentially methylated regions (DMRs) instead of DMPs. The...
متن کاملDM-BLD: differential methylation detection using a hierarchical Bayesian model exploiting local dependency
MOTIVATION The advent of high-throughput DNA methylation profiling techniques has enabled the possibility of accurate identification of differentially methylated genes for cancer research. The large number of measured loci facilitates whole genome methylation study, yet posing great challenges for differential methylation detection due to the high variability in tumor samples. RESULTS We have...
متن کاملMeDIP-HMM: genome-wide identification of distinct DNA methylation states from high-density tiling arrays
MOTIVATION Methylation of cytosines in DNA is an important epigenetic mechanism involved in transcriptional regulation and preservation of genome integrity in a wide range of eukaryotes. Immunoprecipitation of methylated DNA followed by hybridization to genomic tiling arrays (MeDIP-chip) is a cost-effective and sensitive method for methylome analyses. However, existing bioinformatics methods on...
متن کاملbiomvRCNS : Copy Number study and Segmentation for multivariate biological data
With high throughput experiments like tiling array and NGS, researchers are looking for continuous homogeneous segments or signal peaks, which would represent chromatin states, methylation ratio, transcripts or genome regions of deletion and amplification. While in a normal experimental set-up, these profiles would be generated for multiple samples or conditions with replicates. In the package ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Bioinformatics
دوره 33 23 شماره
صفحات -
تاریخ انتشار 2017